"DMWS-MTA SZTAKI"
Data Mining and Web search Research Group
Computer and Automation Research Institute, Hungarian
Academy of Sciences

VAST 2010 Challenge
Text Records - Investigations into Arms Dealing

Authors and Affiliations:

Eszter Friedman, MTA SZTAKI, feszter@info.ilab.sztaki.hu
Julianna Göbölös-Szabó, MTA SZTAKI, gobolos.szabo.julianna@gmail.com
Adrien Szabó, MTA SZTAKI, adrienn.szabo4@gmail.com [PRIMARY contact]
András Lukács, MTA SZTAKI, alukacs@sztaki.hu

Tool(s):

We used two separate tools to solve the task; both were developed by the Data Mining and Web Search Group, Computer and Automation Research Institute, Hungarian Academy of Sciences. NERVis is a named entity recognition (NER) visualizer developed for this VAST Challenge. Special thanks to Attila Zséder for setting up the NER algorithm included in the tool.

The search and visualization tool PinWallVis was designed to explore and understand large data sets represented as networks; it was also developed for other projects at our research group at MTA SZTAKI and was further adapted to serve the VAST 2010 Challenge. Using different algorithmic engines, PinWallVis can not only draw the graph in several different layouts but also offers further ways of working with it, such as searching among the entities extracted from the data, adding extra edges, and finding the shortest paths between nodes.
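As an illustration of the shortest-path feature mentioned above, the following is a minimal sketch using breadth-first search on an unweighted graph. It is not PinWallVis code: the function name `shortest_path`, the adjacency-dictionary representation and the toy node names are assumptions made for this example only.

```python
from collections import deque

def shortest_path(adjacency, start, goal):
    """Return one shortest path from start to goal as a list of nodes, or None.

    `adjacency` maps each node to an iterable of its neighbors (hypothetical
    representation; the internal data structures of PinWallVis are not known).
    """
    if start == goal:
        return [start]
    previous = {start: None}          # node -> node it was reached from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor in previous:
                continue              # already reached by a path at least as short
            previous[neighbor] = node
            if neighbor == goal:
                # Walk back from the goal to the start and reverse the result.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None                       # goal is not reachable from start

# Toy undirected graph with invented node names.
toy_adjacency = {
    "A": ["B", "D"],
    "B": ["A", "C"],
    "C": ["B", "E"],
    "D": ["A", "E"],
    "E": ["D", "C"],
}
path_a_to_c = shortest_path(toy_adjacency, "A", "C")
```

Breadth-first search is sufficient here because all edges count equally; a weighted graph would need a different algorithm.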

Video:

Video

ANSWERS:


MC1.1: Summarize the activities that happened in each country with respect to illegal arms deals based on a synthesis of the information from the different report types and sources.  State the situation in each country at the end of the period (i.e. the end of the information you have been given) with respect to illegal arms deals being pursued.  Present a hypothesis about the next activities you expect to take place, with respect to the people, groups, and countries. 

To visualize information extracted from the text with PinWallVis, we defined two graphs based on the texts: the source graph and a social (and location) network graph. Suppose that the text can be split into smaller parts, called reports. The source graph is a graph whose nodes correspond to the reports and to the entities occurring in them. Edges can exist only between an entity and a report: there is an edge between report R and entity E if and only if E is present in R.
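The source graph definition can be sketched in a few lines. This is only an illustration under assumptions: the input format (a dictionary from report id to the set of recognized entities), the function name `build_source_graph` and the toy data are hypothetical and are not taken from NERVis or PinWallVis.

```python
from collections import defaultdict

def build_source_graph(reports):
    """Build the bipartite report-entity graph as an adjacency dictionary.

    `reports` maps a report id to the set of entities recognized in it.
    Nodes are tagged tuples so that a report and an entity can never collide.
    """
    adjacency = defaultdict(set)
    for report_id, entities in reports.items():
        report_node = ("report", report_id)
        adjacency[report_node]        # keep reports that contain no entity
        for entity in entities:
            entity_node = ("entity", entity)
            # Edge between report R and entity E iff E is present in R.
            adjacency[report_node].add(entity_node)
            adjacency[entity_node].add(report_node)
    return dict(adjacency)

# Invented toy input, not taken from the challenge data.
toy_reports = {
    "R1": {"Pakistan", "Dubai"},
    "R2": {"Dubai"},
}
source_graph = build_source_graph(toy_reports)
```

Looking up an entity node in `source_graph` returns the reports that mention it, which corresponds to the neighbor expansion used later when exploring the graph.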

First we used the named entity recognition visualization software to define the nodes and edges of the source graph. In Fig. 1 the input text, with the recognized entities highlighted, can be seen on the right, while the same entities sorted into tables are on the left.

Fig. 1: The NERVis tool

After defining the entities, the graph could be loaded into PinWallVis.

Fig. 2: PinWallVis: searching for Pakistan

From the search hit list we can select which entities to draw on the screen. Clicking on a cell on the screen makes its neighbors pop up. When more information is needed about a source, right-clicking on its cell shows the full text (see the bottom right of Fig. 4). In this way we could sort through the data to find the information asked for. It also helped that different layouts were developed, such as a special DateBasedSpringLayout, in which cells that include a date field (almost every report’s title, and the recognized entities of date type) are ordered along the x-axis so that earlier events are closer to the left side of the screen.
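The date-ordering idea behind such a layout can be sketched as follows. This is not the DateBasedSpringLayout implementation, whose details are not described here; it only shows one plausible way, assumed for illustration, to turn dates into x-coordinates (a linear mapping), with a hypothetical function name and invented dates.

```python
from datetime import date

def date_based_x(dated_nodes, width=1000.0):
    """Map each dated node to an x-coordinate in [0, width], earliest on the left.

    `dated_nodes` maps a node name to a datetime.date. Nodes without a date
    are simply not part of the input; a spring layout could place them freely.
    """
    if not dated_nodes:
        return {}
    earliest = min(dated_nodes.values())
    latest = max(dated_nodes.values())
    span_days = (latest - earliest).days
    if span_days == 0:
        # All nodes share one date: put them in the middle of the axis.
        return {node: width / 2 for node in dated_nodes}
    return {
        node: width * (when - earliest).days / span_days
        for node, when in dated_nodes.items()
    }

# Invented toy dates, not taken from the challenge data.
toy_dates = {
    "report_a": date(2008, 1, 1),
    "report_b": date(2008, 1, 11),
    "report_c": date(2008, 1, 21),
}
x_positions = date_based_x(toy_dates)
```

In a combined layout the x-coordinates of dated cells would stay fixed while the remaining coordinates are left to the spring forces.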

Fig. 3: PinWallVis: Entities containing the word Pakistan and their sources, arranged with DateBasedSpringLayout.

Fig. 4: PinWallVis: After clicking on all sources, all entities that were mentioned together with Pakistan can be seen.

We used PinWallVis to find the reports mentioning each country and arranged them in temporal order. The series of events and the most recent situation could then be built up for each country. By checking further reports and entities the story could be completed. This process took somewhat longer than reading all reports once, since some reports are connected to more than one country. The result is the following:

Thailand: On 10 Feb an IL-76 cargo plane belonging to Ukrainian owner Arkadi Borodinski was seized in Bangkok while carrying weapons from North Korea, probably to Iran.

In the second half of 2008 at least one arms transaction was completed between Lim Chanarong, who holds a position in a rebel faction in Burma, and Nicolai Kuryakin’s group in Moscow, with Thai arms dealer Boonmee Khemkhaeng acting as mediator.

Yemen: The key person involved in the arms trade is Saleh Ahmed (SA). In 2008 he likely delivered arms to the illegal local market in Yemen, especially in Sa’ada, and he also tried to smuggle arms to Saudi Arabia, where the sale of personal firearms will be allowed in an attempt to undercut the illegal arms trade. The origin of the arms is connected to Russia, since 100 carat diamonds were shipped from Yemen to Moscow (probably to Leonid Minsky) by the money launderer Georgiy Giunter. After Leonid Minsky was shot in Feb 2009, Mikhail Dombrovski took over Minsky’s business with SA. SA is believed to have died on 3 May 2009 after returning from a likely meeting with Nicolai Kuryakin in Dubai on 19 Apr 2009.

Ukraine: Igor Sviatoslavich was killed in connection with the case of the cargo plane seized in Bangkok. Arkadi Borodinski’s business is probably based in Kiev. The Ukrainian government made at least one arms shipment to the Kenyan government, which Kenya probably resold illegally to the government of Southern Sudan.

Kenya: Three persons were accused in connection with a raided British armoury but were later released. Two of them died in Nairobi on 1 May after returning from a likely meeting with Nicolai Kuryakin in Dubai on 17 Apr 2009. The case may also be connected with a raided police armoury, used for local disturbances and as preparation for a possible feud in the coming year(s).
The cargo ship MV Tanya, captured by pirates and later released for USD 3.2M, was supposedly heading to Kenya. A larger amount of arms of Russian origin is likely to appear in Kenya in the future.

Pakistan: After a successful countermeasure by the Pakistani police in Feb 2008, the terrorist organization Lashkar-e-Jhangvi looked for new arms. At least two (series of) money transfers (30 March and 17 Nov 2008) from an account that probably belonged to Maulana Haq Bukhari (MHB), a prominent member of Lashkar-e-Jhangvi, to a bank account connected to Mikhail Dombrovski, a Russian arms dealer, show MHB’s increased activity to obtain new weaponry. A meeting between them in Dubai in Apr 2009 forecasts further transactions.

Russia: Mikhail Dombrovski built a new connection with partners close to the government in Nigeria. The business appears to consist of fraud and money laundering, but arms dealing may also occur.

Venezuela: A successful action by the Venezuelan police in Oct 2008 created demand for a new source of arms in the country. A buyer was connected to Mikhail Dombrovski of Russia by a middleman from Colombia. A money transfer was detected in Dec 2008, and a planned meeting in the UAE in Apr 2009 predicts further deals.

Gaza/Lebanon: The Martyrs Front of Judea (MFJ) plans increased activity in May 2009 and is looking for arms from multiple sources. Muhamed Kashem, a leader of MFJ, and others organized a meeting in Dubai for 18 Apr 2009 to conclude a possible deal for arms of Russian origin.

Syria: A greater number of new recruits in a military training camp in Syria creates demand for new arms. Their connection in Turkey reached Russian arms dealers through a middleman from Bosnia. The deal was planned to be completed in Dubai in the middle of Apr 2009.

Since several prominent people connected with illegal arms dealing have died and some have been arrested, new participants are expected to take part.


MC1.2:  Illustrate the associations among the players in the arms dealing through a social network.  If there are linkages among countries, please highlight these as well in the social network.  Our analysts are interested in seeing different views of the social network that might help them in counterintelligence activities (people, places, activities, communication patterns that are key to the network).

Two graphs were defined to analyze and visualize the information gained from the input text with PinWallVis. The source graph was defined earlier. The social network is a graph whose nodes are the persons appearing in any of the sources. Two nodes (persons) are connected if there is at least one source in which both are mentioned. In the same way we built a location network and a location-person network. In these graphs the entities of the appropriate type correspond to the nodes, and two entities are connected if they are mentioned together somewhere.
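The co-occurrence rule for the social network can be sketched as follows. The function name `build_cooccurrence_graph`, the input format (report id mapped to the set of person names found in it) and the placeholder names are assumptions for illustration, not the actual PinWallVis code.

```python
from itertools import combinations

def build_cooccurrence_graph(reports):
    """Return the set of undirected edges of the co-occurrence network.

    `reports` maps a report id to the set of person names found in it.
    Two persons are connected if at least one report mentions both.
    Each edge is stored once, as an alphabetically ordered pair.
    """
    edges = set()
    for persons in reports.values():
        for a, b in combinations(sorted(persons), 2):
            edges.add((a, b))
    return edges

# Invented toy input with placeholder names, not taken from the challenge data.
toy_person_reports = {
    "R1": {"P1", "P2", "P3"},
    "R2": {"P3", "P4"},
    "R3": {"P5"},
}
social_edges = build_cooccurrence_graph(toy_person_reports)
```

A report that lists k persons contributes a complete subgraph on those k nodes, which is why long name lists in a single report show up as cliques in the drawing. The same function yields a location network or a location-person network if it is fed the corresponding entity sets.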

The different layouts of PinWallVis help to analyze the data. Which layout makes the graph easiest to understand depends on the structure of the data and on the type of information the user wants to gather. A small amount of data can be clear with a circular layout. Often the most helpful layout is the spring layout, which is a force-based algorithm. The purpose of a force-based algorithm is to position the nodes of a graph in two- or three-dimensional space so that all edges are of more or less equal length and there are as few crossing edges as possible.
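To make the force-based idea concrete, here is a minimal two-dimensional sketch in the style of the Fruchterman-Reingold algorithm: all node pairs repel each other, connected nodes attract each other, and the step size shrinks over time. It is an assumption that a scheme of this kind resembles the spring layout in PinWallVis; the function name, the parameters and the constants are chosen for illustration only.

```python
import math
import random

def spring_layout(nodes, edges, size=1.0, iterations=100, seed=0):
    """Return {node: (x, y)} positions inside a size-by-size square."""
    nodes = list(nodes)
    if not nodes:
        return {}
    rng = random.Random(seed)
    pos = {n: [rng.uniform(0, size), rng.uniform(0, size)] for n in nodes}
    k = size / math.sqrt(len(nodes))      # ideal distance between nodes
    temperature = size / 10               # maximum movement per iteration
    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        # Repulsive force k^2 / d between every pair of nodes.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                dist = max(math.hypot(dx, dy), 1e-9)
                force = k * k / dist
                disp[u][0] += dx / dist * force
                disp[u][1] += dy / dist * force
                disp[v][0] -= dx / dist * force
                disp[v][1] -= dy / dist * force
        # Attractive force d^2 / k along every edge.
        for u, v in edges:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            dist = max(math.hypot(dx, dy), 1e-9)
            force = dist * dist / k
            disp[u][0] -= dx / dist * force
            disp[u][1] -= dy / dist * force
            disp[v][0] += dx / dist * force
            disp[v][1] += dy / dist * force
        # Move each node by at most `temperature`, staying inside the square.
        for n in nodes:
            length = max(math.hypot(disp[n][0], disp[n][1]), 1e-9)
            step = min(length, temperature)
            pos[n][0] = min(size, max(0.0, pos[n][0] + disp[n][0] / length * step))
            pos[n][1] = min(size, max(0.0, pos[n][1] + disp[n][1] / length * step))
        temperature *= 0.95               # cool down so the layout settles
    return {n: (p[0], p[1]) for n, p in pos.items()}

# Toy path graph a - b - c.
layout = spring_layout(["a", "b", "c"], [("a", "b"), ("b", "c")])
```

The pairwise repulsion loop is quadratic in the number of nodes, so this sketch is only suitable for small graphs.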

We can easily gain extra information from the graph not just by looking at it, but also by using the menu. The nodes with the highest degrees can be listed. Supposing that a central person of the arms dealing social network is often mentioned together with different people, the node corresponding to that person ought to have a high degree. (Note that if a relatively large complete subgraph of the graph exists, it means that all persons corresponding to the nodes in the subgraph were mentioned together in one report. The fact that a report lists several names does not mean that those people are central to anything.)
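Listing the nodes with the highest degrees amounts to counting the edges at each node. The sketch below assumes an undirected edge list of name pairs; the function name `top_degree_nodes` and the placeholder names are hypothetical.

```python
from collections import Counter

def top_degree_nodes(edges, n=3):
    """Return the n nodes with the highest degree as (node, degree) pairs.

    `edges` is an iterable of undirected (a, b) pairs, each listed once.
    Ties are broken alphabetically so that the result is deterministic.
    """
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]

# Invented toy edges with placeholder names.
toy_edges = [("hub", "P1"), ("hub", "P2"), ("hub", "P3"), ("P1", "P2")]
top_two = top_degree_nodes(toy_edges, n=2)
```

As the note above warns, a single report with many names inflates the degree of every person in it, so a ranking like this should be read together with the source graph.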

Fig. 5: PinWallVis: The Minimap option is switched on so the graph can be seen as a whole with the part which is visible on the screen being marked.

The picture above shows the social network graph. On the Minimap all clusters of the network can be noticed easily. On the left side of the screen there is a complete subgraph, which stands for the people who took part in the civil disturbance in Pakistan. The other complete subgraph, at the bottom of the screen, consists of the people who arrived on the cargo plane caught in Bangkok. The node with the highest degree, not taking into consideration those mentioned in the report dealing with the civil disturbance, is Nicolai Kuryakin.

We also have the option of adding an extra edge to the graph on screen. For example, an edge between George Ngoki and Engr. Funsho Kapolalum (who use the same email address, as can easily be seen from the source graph) can be added with little effort.

Fig. 6: The graph of social and location network extracted from all text visualized with PinWallVis.

The second picture (Fig. 6) shows the location and social network. In the centre of the main panel the cluster of people connected with Pakistan (e.g. Bhutani) and the locations in Pakistan (e.g. Lahore) can be seen. Person cells are always colored green, while location cells are yellow. On the Minimap the structure of the graph can be easily analyzed. The panel in the top right part of the screen shows the list of the nodes with the highest degrees. Ukraine is the most often mentioned location, while Nicolai Kuryakin is again the most often mentioned person. The two persons following him, with degree 17, are Mudassar Nausherwani and Azeem Bhutani. On the left side the people mostly associated with Karachi can be seen.

Fig. 7: Turkey: the phone number in the centre

When the information connected with Turkey is loaded, the very center of the graph is formed by the telephone number and “textbooks”. It seems that the central person in this country has this phone number, and that the only major business right now has to do with “textbooks”. All sources are of type “telephone and wired com”.

After analysis of the graphs the connections can be easily spotted, such as in Fig. 8, where the neighbors of Moscow can be seen. The person node with the highest degree is Dombrovski (in-degree 10), followed by Nicolai Kuryakin (in-degree 8). The locations with the highest degrees (not taking Moscow into account) are Yemen (in-degree 12) and Dubai (in-degree 11), taken separately. It can also be read from the graph that Dr. George, who seems to be a relatively new player, has an obvious link to Mikhail Dombrovski.

Ukraine also has a major role in the arms dealing. The cargo plane seized in Bangkok and the cargo ship captured in Somalia both have important links to Ukraine.